Abstract 

It is known that storage capacity per synapse increases with synaptic pruning in the case
of a correlation-type associative memory model. However, the storage capacity of the entire
network then decreases. To overcome this difficulty, we propose decreasing the connecting
rate while keeping the total number of synapses constant by introducing delayed synapses. In
this paper, a discrete synchronous-type model with both delayed synapses and synaptic pruning
is discussed as a concrete example of the proposal. First, we explain the Yanai-Kim theory by
employing the statistical neurodynamics. This theory involves macrodynamical equations for the 
dynamics of a network with serial delay elements. Next, considering the translational symmetry 
of the explained equations, we re-derive macroscopic steady state equations of the model by 
using the discrete Fourier transformation. The storage capacities are analyzed quantitatively. 
Furthermore, two types of synaptic pruning are treated analytically: random pruning and
systematic pruning. As a result, it becomes clear that for both types of pruning, the storage capacity
increases as the length of delay increases and the connecting rate of the synapses decreases
when the total number of synapses is constant. Moreover, an interesting fact becomes clear:
the storage capacity asymptotically approaches $2/\pi$ under random pruning. In contrast, the
storage capacity diverges in proportion to the logarithm of the length of delay under systematic
pruning, and the proportion constant is $4/\pi$. These results theoretically support the significance
of pruning following an overgrowth of synapses in the brain and strongly suggest that the brain 
prefers to store dynamic attractors such as sequences and limit cycles rather than equilibrium 
states. 

keywords 

associative memory, neural network, delay, synaptic pruning, statistical neurodynamics 

1 Introduction 

Robustness against noise and damage is often cited as a positive feature of neural networks.
It is therefore important to analyze neural networks with respect to synaptic pruning. In
particular, correlation-type associative memory models with randomly pruned
synapses have been discussed in detail. As a result, it became quantitatively clear
that synapse efficiency, which is defined as storage capacity per synapse, increases with synaptic
pruning, although the storage capacity of the entire network decreases.

On the other hand, it has often been observed that synapses are pruned following an
overgrowth in real neural systems. Though the functional
significance of this phenomenon is not known, Chechik et al. recently proposed the following
hypothesis. They considered cutting synapses that are lightly weighted after learning
with an excess of synapses, expecting synapse efficiency to increase through such systematic
pruning. They therefore hypothesized that this increase in synapse efficiency gives functional
significance to synaptic pruning following an overgrowth. They used a correlation-type
auto-associative memory model to verify this hypothesis. After correlation learning in a model in
which all neurons are fully connected, they kept only the heavily weighted synapses. Through computer
simulations that evaluated the storage capacity, they showed that the synapse efficiency increased.
In this paper, synaptic pruning as described above is called systematic pruning. Although the
hypothesis of Chechik et al. is interesting from the viewpoint of neuroscience, some points remain
unclear or incomplete from the theoretical viewpoint: for example, to what degree systematic pruning is more
efficient than random pruning. Accordingly, Mimura et al. analyzed this system by using the
self-consistent signal-to-noise analysis (SCSNA) [21], which is a method of statistical mechanics.
They showed that systematic pruning increased synapse efficiency by the order of $-\ln(1-R)$
over random pruning in the limit where $R$ approached unity, where $R$ ($0 < R < 1$) was the
rate of synaptic pruning. The important point in this case is that the storage capacity of the
entire network decreased, though synapse efficiency increased under random pruning or systematic
pruning.

To overcome this difficulty, we propose decreasing the connecting rate while keeping the
total number of synapses constant by introducing delayed synapses into a discrete
synchronous-type model. In this model, the storage capacity is expected to grow, because
synapse efficiency increases with synaptic pruning while the total
number of synapses remains constant. The discrete synchronous-type model with delayed
synapses was proposed by Fukushima. Yanai and Kim [24] theoretically analyzed
this model with the statistical neurodynamics [22]. Their theory closely agrees with the results
of our computer simulations.

In this paper, after defining the model, we explain the Yanai-Kim theory [24, 25, 26], which uses
the statistical neurodynamics [22] to obtain macrodynamical equations for a network
with delayed synapses. The Yanai-Kim theory needs a computational complexity of $O(L^4 t)$ to
obtain the macrodynamics, where $L$ and $t$ are the length of delay and the time step, respectively.
Therefore, this theory is intractable for discussing macroscopic properties in the limit where $L$ is
extremely large [25, 27]. Thus, considering the translational symmetry of time steps, which holds
in the steady state of the Yanai-Kim theory, we re-derive the macroscopic steady state equations
by employing the discrete Fourier transformation, where the computational complexity does not
formally depend on $L$. Using the re-derived steady state equations, storage capacities
can be quantitatively discussed even in the large $L$ limit.

Next, synaptic pruning in the delayed network is investigated theoretically, and storage
capacities are evaluated quantitatively. We deal with two types of pruning: random pruning
and systematic pruning. As a result, it becomes clear that in both types of pruning, storage
capacity increases as the length of delay increases and the connecting rate of synapses decreases
while the total number of synapses is kept constant. Moreover, an interesting fact becomes clear: the
storage capacity asymptotically approaches $2/\pi$ under random pruning. In contrast, the storage
capacity diverges in proportion to the logarithm of the length of delay $L$, that is, as $(4/\pi)\ln L$, under
systematic pruning.

2 Delayed Network 
2.1 Model 

The structure of the delayed network discussed in this paper is shown in Figure 1. This figure
corresponds to the case of full synaptic connections, meaning no synaptic pruning. The network
has $N$ neurons, and $L-1$ serial delay elements are connected to each neuron. All neurons, as
well as all delay elements, have synaptic connections with all neurons. In this neural network,
all neurons and all delay elements change their states simultaneously, i.e., this network employs
a discrete synchronous updating rule. The output of each neuron is determined by

$$x_i^{t+1} = F\left(u_i^t\right), \tag{1}$$
$$F(\cdot) = \mathrm{sgn}(\cdot), \tag{2}$$
$$u_i^t = \sum_{l=0}^{L-1}\sum_{j=1}^{N} J_{ij}^{l}\, x_j^{t-l}, \tag{3}$$

where $x_i^t$ denotes the output of the $i$th neuron at time $t$, and $J_{ij}^l$ denotes the connection weight
from the $l$th delay element of the $j$th neuron to the $i$th neuron. Here, sgn is the sign function
defined as

$$\mathrm{sgn}(u) = \begin{cases} +1, & u \ge 0, \\ -1, & u < 0. \end{cases} \tag{4}$$

In this paper, the limit $N \to \infty$ is used unless stated otherwise.

Let us consider storing a sequence of $\alpha N$ memory patterns, $\xi^1 \to \xi^2 \to \cdots \to \xi^{\alpha N}$.
Here, $\alpha$ and $\alpha N$ are the loading rate and the length of the sequence, respectively. Each
component of $\xi^\mu$ is assumed to be an independent random variable that takes a value of either
$+1$ or $-1$ according to the following probabilities:

$$\mathrm{Prob}\left[\xi_i^\mu = \pm 1\right] = \frac{1}{2}. \tag{5}$$

The synaptic weight $J_{ij}^l$ is determined by correlation learning:

$$J_{ij}^l = \frac{c_l}{N}\sum_{\mu} \xi_i^{\mu+1+l}\,\xi_j^{\mu}, \tag{6}$$

where $c_l$ is the strength of the $l$th delay step.

Correlation learning is an algorithm based on the Hebb rule, and it is inferior to
error-correcting learning in terms of storage capacity. However, as seen in (6), it is not necessary
to re-learn all patterns that were stored in the past when adding new patterns. Furthermore,
correlation learning has been analyzed by many researchers due to its simplicity.
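
The model above can be illustrated with a minimal simulation sketch. It is not the simulation code used in this paper: the cyclic treatment of pattern indices (taken modulo the sequence length), the small sizes, the random seed and all function names are assumptions made only for illustration.

```python
import numpy as np

def correlation_weights(xi, c):
    """Delayed correlation learning, J[l] = (c[l]/N) * sum_mu xi^{mu+1+l} (xi^mu)^T.

    xi has shape (P, N). Pattern indices wrap modulo P (an assumption of this
    sketch, so that the stored sequence forms a closed cycle).
    """
    P, N = xi.shape
    J = np.zeros((len(c), N, N))
    for l, c_l in enumerate(c):
        target = np.roll(xi, -(l + 1), axis=0)  # row mu holds xi^{mu+1+l}
        J[l] = (c_l / N) * (target.T @ xi)
    return J

def recall(J, history, steps):
    """Synchronous update x^{t+1} = sgn(sum_l J[l] x^{t-l}); history[l] is x^{t-l}."""
    history = [h.copy() for h in history]
    for _ in range(steps):
        u = sum(J_l @ x for J_l, x in zip(J, history))
        x_new = np.where(u >= 0, 1, -1)
        history = [x_new] + history[:-1]  # shift the delay line by one step
    return history[0]

def overlap(pattern, state):
    """Direction cosine between a stored pattern and a network state."""
    return float(pattern @ state) / len(state)

rng = np.random.default_rng(0)
N, P, L, steps = 200, 10, 2, 25            # loading rate alpha = P / N = 0.05
xi = rng.choice([-1, 1], size=(P, N))
J = correlation_weights(xi, c=np.ones(L))
history = [xi[(0 - l) % P] for l in range(L)]  # neurons and delay elements start on the sequence
final = recall(J, history, steps)
m_final = overlap(xi[steps % P], final)
```

At this small loading rate the network is expected to follow the stored sequence, so `m_final` should be close to 1.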



2.2 Dynamical Behaviors of Macroscopic Order Parameters by Statistical 
Neurodynamics 

In the case of a small loading rate $\alpha$, if a state close to one or a set of the patterns stored as a
sequence is given to the network, the stored sequence of memory patterns is retrieved. However,
when the loading rate $\alpha$ increases, the memory fails at a certain $\alpha$. That is, even if a state close
to one or a set of the patterns stored as a sequence is given to the network, the state of the
network tends to leave the stored sequence of memory patterns. Moreover, even if one or a set
of the patterns itself is given to the network, the state of the network tends to leave the stored
sequence of memory patterns. This phenomenon, in which the memory suddenly becomes unstable at
a critical loading rate, can be considered a kind of phase transition. Here, the storage capacity
$\alpha_c$ is defined as the critical loading rate at which recall becomes unstable.

We define the overlap, or direction cosine, between a state $x^t = (x_i^t)$ appearing in a recall
process at time $t$ and an embedded pattern $\xi^\mu = (\xi_i^\mu)$ as

$$m_\mu^t = \frac{1}{N}\sum_{i=1}^{N}\xi_i^\mu x_i^t. \tag{7}$$

Using this definition, when the state of the network at time $t$ and the $\mu$th pattern agree
perfectly, the overlap $m_\mu^t$ is equal to unity. When they have no correlation, the overlap is
equal to zero. Therefore, the overlap provides a means of measuring recall quality.
Amari and Maginu proposed the statistical neurodynamics for the associative memory model
[22, 23, 31]. This analytical method handles the dynamical behavior of the associative memory
model macroscopically, where cross-talk noise is regarded as a Gaussian random variable with
a mean of zero and a time-dependent variance of $\sigma_t^2$. They then derived recursive relations for
the variance and the overlap.

Yanai and Kim applied this method to the present model, and succeeded in obtaining macroscopic
state transition equations [24]. We will briefly explain their derivation as follows.

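The total input of the $i$th neuron at time $t$ is given as

$$u_i^t = \sum_{l=0}^{L-1}\sum_{j=1}^{N} J_{ij}^l x_j^{t-l} = s^t \xi_i^{t+1} + z_i^t, \tag{8}$$
$$s^t = \sum_{l=0}^{L-1} c_l\, m_{t-l}^{t-l}, \tag{9}$$
$$z_i^t = \sum_{l=0}^{L-1} c_l \sum_{\nu \neq t} \xi_i^{\nu+1}\, m_{\nu-l}^{t-l}. \tag{10}$$

The first term in (8) is the signal useful for recall, while the second term is cross-talk noise that
prevents $\xi_i^{t+1}$ from being recalled. This procedure is called a signal-to-noise analysis.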
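We can then use the first-order Taylor expansion regarding $F(\cdot)$ to obtain

$$m_\mu^{t} = \frac{1}{N}\sum_{i=1}^{N} \xi_i^\mu\, x_i^{t,(\mu)} + U_t \sum_{l'=0}^{L-1} c_{l'}\, m_{\mu-l'-1}^{t-l'-1}, \tag{11}$$
$$x_i^{t,(\mu)} = F\left(\sum_{l=0}^{L-1}\sum_{j=1}^{N} \frac{c_l}{N} \sum_{\nu \neq \mu-l-1} \xi_i^{\nu+l+1}\xi_j^{\nu}\, x_j^{t-l-1}\right), \tag{12}$$
$$U_t = \frac{1}{N}\sum_{i=1}^{N} F'\left(\sum_{l=0}^{L-1}\sum_{j=1}^{N} \frac{c_l}{N} \sum_{\nu \neq \mu-l-1} \xi_i^{\nu+l+1}\xi_j^{\nu}\, x_j^{t-l-1}\right), \tag{13}$$

where $x_i^{t,(\mu)}$ is the output obtained when the terms containing $\xi_i^\mu$ are removed from the input,
and $F'(\cdot)$ is the derivative of $F(\cdot)$.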

Taking the correlation in the cross-talk noise $z_i^t$ into account, we can derive the following
macrodynamical equations using (11)-(13) (see Appendix A).

$$\sigma_t^2 = \sum_{l=0}^{L-1}\sum_{l'=0}^{L-1} c_l c_{l'}\, v_{t-l,t-l'}, \tag{14}$$
$$v_{t-l,t-l'} = \alpha\delta_{l,l'} + U_{t-l}U_{t-l'} \sum_{k=0}^{L-1}\sum_{k'=0}^{L-1} c_k c_{k'}\, v_{t-l-k-1,t-l'-k'-1} + \alpha\left(c_{l-l'-1}U_{t-l'} + c_{l'-l-1}U_{t-l}\right), \tag{15}$$
$$U_t = \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma_{t-1}} \exp\left(-\frac{(s^{t-1})^2}{2\sigma_{t-1}^2}\right), \tag{16}$$
$$s^t = \sum_{l=0}^{L-1} c_l\, m_{t-l}, \tag{17}$$
$$m_{t+1} = \mathrm{erf}\left(\frac{s^t}{\sqrt{2}\,\sigma_t}\right), \tag{18}$$

where $m_t$ denotes $m_t^t$, $v_{t-l,t-l'} = \sum_{\nu \neq t} E\left[m_{\nu-l}^{t-l}\, m_{\nu-l'}^{t-l'}\right]$, and $\sigma_t^2$ is the variance of the cross-talk noise.
$U_t$ is a kind of susceptibility, which measures the sensitivity of neuron output with respect to
the external input. If $t < 0$, $m_t = 0$ and $U_t = 0$. If $k < 0$, $c_k = 0$. If either $k < 0$ or $k' < 0$,
$v_{k,k'} = 0$. The expression $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x \exp(-u^2)\,du$ denotes the error function. In this paper,
the initial condition is that the states of all neurons and all delay elements are set to be the
stored pattern sequences. In this case, $m_l = 1$ $(l = 0, \cdots, L-1)$ and $v_{l,l} = \alpha$ $(l = 0, \cdots, L-1)$.
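
For $L = 1$ and $c_0 = 1$, the macrodynamical equations reduce to a two-variable recursion, because (14)-(16) then give $\sigma_t^2 = \alpha + U_t^2\sigma_{t-1}^2 = \alpha + \frac{2}{\pi}\exp(-m_{t-1}^2/\sigma_{t-1}^2)$. The following sketch iterates this special case and locates the loading rate at which retrieval is lost by bisection. The step count, the bisection bracket and the overlap threshold are arbitrary choices of this sketch, not values from the paper.

```python
import math

def steady_overlap(alpha, steps=5000):
    """Iterate the L = 1 recursion from m_0 = 1, sigma_0^2 = alpha."""
    m, var = 1.0, alpha
    for _ in range(steps):
        m_next = math.erf(m / math.sqrt(2.0 * var))                   # overlap update
        var_next = alpha + (2.0 / math.pi) * math.exp(-m * m / var)   # noise variance update
        m, var = m_next, var_next
    return m

def critical_loading(lo=0.05, hi=1.0, iters=30, threshold=0.5):
    """Bisection on alpha: retrieval is declared when the final overlap exceeds threshold."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if steady_overlap(mid) > threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_c = critical_loading()
```

Under these assumptions the estimate should land near the value 0.269 quoted later in this paper for sequential association without delay.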

2.3 Macroscopic Steady State Analysis by Discrete Fourier Transformation 
and Discussion 

The Yanai-Kim theory explained in the previous section, which involves the macrodynamical
equations obtained by the statistical neurodynamics, needs a computational complexity of
$O(L^4 t)$ to obtain the macrodynamics shown in (14) and (15), where $L$ and $t$ are the length of
delay and the time step, respectively [24, 25, 26]. Therefore, in this method, it is difficult to
investigate the critical loading rate in the large $L$ limit, i.e., the asymptotic behavior of the storage
capacity. Thus, Miyoshi, Yanai and Okada considered the Yanai-Kim theory
in a steady state and derived the macroscopic steady state equations of the delayed network.
Furthermore, the storage capacity was analyzed for a large $L$ by solving the derived equations
numerically [25, 26].

We will briefly explain the derivation of the macroscopic steady state equations to make the 
present paper self-contained. 

For simplicity, let us assume that $c_l = 1$, $l = 0, \cdots, L-1$. In a steady state, $v_{t-l,t-l'}$ can be
expressed as $v_{l-l'}$ because of the translational symmetry in terms of time step. Therefore, by
modifying (14) and (15), we obtain

$$\sigma^2 = \sum_{n=1-L}^{L-1} (L-|n|)\, v(n), \tag{19}$$
$$v(n) = \alpha\delta_{n,0} + U^2 \sum_{i=1-L}^{L-1} (L-|i|)\, v(n-i) + U\alpha\, d(n), \tag{20}$$
$$d(n) = \begin{cases} 1, & |n| = 1, 2, \cdots, L, \\ 0, & \text{otherwise}, \end{cases} \tag{21}$$

where $n = l - l'$, $i = k - k'$, $v(n)$ denotes $v_n$ and $\delta$ is Kronecker's delta.

Using the discrete Fourier transformation, we can obtain the steady state equations in terms
of the network's macroscopic variables as (22)-(25) (see Appendix B).

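$$\sigma^2 = \int_{-1/2}^{1/2} \frac{\alpha\left[(1-U)\sin(\pi x) + U\sin\{(2L+1)\pi x\}\right]\left[1-\cos(2L\pi x)\right]}{\sin(\pi x)\left[2\sin^2(\pi x) - U^2\{1-\cos(2L\pi x)\}\right]}\,dx, \tag{22}$$
$$U = \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma}\exp\left(-\frac{s^2}{2\sigma^2}\right), \tag{23}$$
$$s = mL, \tag{24}$$
$$m = \mathrm{erf}\left(\frac{s}{\sqrt{2}\,\sigma}\right). \tag{25}$$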
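
As a quick consistency check, which is not part of the original derivation, setting $L = 1$ in (22) and using $1-\cos(2\pi x) = 2\sin^2(\pi x)$ and $\sin(3\pi x) = \sin(\pi x)\,(3 - 4\sin^2(\pi x))$ gives

```latex
\sigma^2
  = \int_{-1/2}^{1/2}
    \frac{\alpha\left[(1-U) + U\left(3 - 4\sin^2(\pi x)\right)\right]}{1-U^2}\,dx
  = \frac{\alpha\left[(1-U) + U\right]}{1-U^2}
  = \frac{\alpha}{1-U^2},
\qquad\text{since}\quad
\int_{-1/2}^{1/2}\left(3 - 4\sin^2(\pi x)\right)dx = 1 .
```

This is the steady state of $\sigma^2 = \alpha + U^2\sigma^2$, which is what (14) and (15) reduce to when there are no delay elements.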



Though the derived macroscopic steady state equations include a simple integral, their
computational complexity does not formally depend on $L$. Therefore, we can easily perform
numerical calculations for a large $L$. Figure 2 shows the results of theoretical calculations in cases
where $L = 1, 3$ and $10$, which are obtained by solving these equations numerically. Figure 3
shows the results of computer simulations. In these figures, the abscissa is the loading rate $\alpha$.
In the computer simulations, the number of neurons is $N = 500$. The initial condition is that
the states of all neurons and all delay elements are set to be the stored pattern sequences. The
steady state overlaps $m_\infty$ are obtained by calculations with a sufficient number of steps. Eleven
simulations were carried out for each combination of loading rate $\alpha$ and length of delay $L$. Data
points • , o , ■ indicate the medians, that is, the sixth largest values, for $L = 1, 3$ and $10$, respectively,
in the eleven trials. Error bars indicate the third and ninth largest values in the eleven trials.
In each trial, the loading rate is increased by adding new patterns.

These figures show that the steady states obtained by the derived theory agree closely with
those obtained by computer simulation. Therefore, in the case of a large $L$, only the theoretical
calculations are executed. Figure 4 shows the results, while Figure 5 shows the relationship
between the length of delay $L$ and the storage capacity $\alpha_c$.

From these figures, we can see that the storage capacity increases in proportion to the length
of delay $L$ in the large $L$ limit, with a proportion constant of 0.195. In other words, the storage
capacity of the delayed network $\alpha_c$ equals $0.195L$ when the length of delay $L$ is large [25, 26].
Although the result that the delayed network's storage capacity is proportional to
the length of delay $L$ may be trivial, the fact that this result has been shown analytically is
significant. Moreover, the proportion constant 0.195 is a mathematically significant number
because it represents the limit of the delayed network's storage capacity.

3 Synaptic Pruning 

3.1 Necessity of Analyzing Synaptic Pruning 

During brain development, the phenomenon of synaptic pruning following overgrowth
can be observed. Since this pruning following overgrowth seems to
be a universal phenomenon occurring in almost all areas (visual cortex, motor area, association
area, and so on), it is important to analyze synaptic pruning and to discuss its properties
quantitatively.

In real neural systems, some synaptic delay is inevitable. This property can be analyzed
with a model that involves both delay elements and synaptic pruning. For example, Figure 6
shows that, in a model whose length of delay is five, a delay of three time steps can be
represented by pruning the first, second, fourth and fifth synapses, and a five-time-step delay
can be represented by pruning the first, second, third and fourth synapses. From this perspective,
analyzing a model with both delay elements and synaptic pruning is significant.

Moreover, in the case of a delayed network with no pruning, it is obvious that storage capacity
increases as the length of delay $L$ increases. It is therefore more interesting to analyze the
storage capacity of a delayed network whose number of synapses is kept constant by introducing
synaptic pruning.

It has been reported that the synapse efficiency, which is defined as storage capacity per
synapse, increases due to synaptic pruning in networks with no delay elements [21]. Two types
of pruning can be considered, namely random pruning and systematic pruning, which are typical
methods. Mimura et al. showed that synapse efficiency converged to $2/\pi$ under random
pruning and diverged as $\frac{2}{\pi}(-2\ln c)$ under systematic pruning in the limit where the connecting
rate $c$ is extremely small. Here, the relation between connecting rate $c$ and pruning rate $R$ was
given by $c = 1 - R$. The important point here is that the storage capacity of the entire network
decreases, since the number of synapses decreases.

In the following discussion, a delayed network with synaptic pruning is analyzed on the
basis of the macrodynamical equations and macroscopic steady state equations re-derived in the
previous section. We consider two types of synaptic pruning: random pruning and systematic
pruning.



3.2 Random Pruning 

In this section, synapses of a delayed network are randomly pruned. Random pruning of synapses
can be realized without any complicated control mechanism, so it is important to investigate its
effect on the dynamical behavior of pattern recall and storage capacity.

In the random synaptic pruning model, synaptic connections are constituted as

$$J_{ij}^l = \frac{c_{ij}^l\, c_l}{Nc} \sum_\mu \xi_i^{\mu+1+l}\xi_j^\mu, \tag{26}$$
$$\mathrm{Prob}\left[c_{ij}^l = 1\right] = 1 - \mathrm{Prob}\left[c_{ij}^l = 0\right] = c, \tag{27}$$

where $c$ is the connecting rate.
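
A small sketch of how randomly pruned weights of this kind might be built numerically is given below. The function name, the sizes and the cyclic treatment of pattern indices are assumptions of this sketch; the choice $c = 1/L$ anticipates the condition used later, under which the total number of synapses stays at about $N^2$.

```python
import numpy as np

def randomly_pruned_weights(xi, c_l, c, rng):
    """Bernoulli(c) mask applied to delayed correlation weights, rescaled by 1/c.

    xi has shape (P, N); pattern indices wrap modulo P (assumption of this sketch).
    """
    P, N = xi.shape
    mask = rng.random((len(c_l), N, N)) < c      # c_ij^l = 1 with probability c
    J = np.zeros((len(c_l), N, N))
    for l, strength in enumerate(c_l):
        target = np.roll(xi, -(l + 1), axis=0)   # row mu holds xi^{mu+1+l}
        J[l] = mask[l] * (strength / (N * c)) * (target.T @ xi)
    return J, mask

rng = np.random.default_rng(1)
N, P, L = 300, 6, 3
c = 1.0 / L                                      # connecting rate tied to the delay length
xi = rng.choice([-1, 1], size=(P, N))
J, mask = randomly_pruned_weights(xi, np.ones(L), c, rng)
density = mask.mean()                            # empirical connecting rate
total_ratio = mask.sum() / (N * N)               # synapses relative to a delay-free full network
```

The rescaling by $1/c$ keeps the average weight equal to that of the fully connected network, which is why the signal term is unchanged by pruning.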
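Modifying (26), we obtain

$$J_{ij}^l = \frac{c_l}{N}\sum_\mu \xi_i^{\mu+1+l}\xi_j^\mu + \frac{c_l\left(c_{ij}^l - c\right)}{Nc}\sum_\mu \xi_i^{\mu+1+l}\xi_j^\mu. \tag{28}$$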



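Using (28), we obtain the total input of the $i$th neuron at time $t$ as

$$u_i^t = \sum_{l=0}^{L-1}\sum_{j=1}^{N} J_{ij}^l x_j^{t-l} = s^t\xi_i^{t+1} + z_i^t + \sum_{l=0}^{L-1}\sum_{j=1}^{N} \frac{c_l\left(c_{ij}^l - c\right)}{Nc}\sum_\mu \xi_i^{\mu+1+l}\xi_j^\mu\, x_j^{t-l}, \tag{29}$$
$$s^t = \sum_{l=0}^{L-1} c_l\, m_{t-l}, \tag{30}$$
$$z_i^t = \sum_{l=0}^{L-1} c_l \sum_{\nu \neq t} \xi_i^{\nu+1}\, m_{\nu-l}^{t-l}. \tag{31}$$

As in the case of a fully connected network, the first and second terms of (29) are the signal useful
for recall and the cross-talk noise, respectively. The third term is new noise generated by synaptic pruning.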
In the third term of (29), the output $x_j^{t-l}$ is rewritten as in (32)-(34), and the third term of (29)
is then decomposed into two terms as in (35), the second of which is evaluated in (36). [The displayed
forms of equations (32)-(36) are not legible in the available text.] The two factors appearing in (36)
obey $N(0, 1/N)$ and $N(0, O(1/N))$, respectively. Therefore, the second term of (35) becomes 0 in the
large $N$ limit. Here, $E[\cdot]$ and $N(a, \sigma^2)$ stand for an average and a Gaussian distribution with
average $a$ and variance $\sigma^2$, respectively.

Using this result, the third term of (29) becomes

$$\sum_{l=0}^{L-1}\sum_{j \neq i} \frac{c_l\left(c_{ij}^l - c\right)}{Nc}\sum_\mu \xi_i^{\mu+1+l}\xi_j^\mu\, x_j^{t-l} \sim N\left(0,\ \frac{\alpha(1-c)}{c}\sum_{l=0}^{L-1} c_l^2\right). \tag{37}$$


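As a result, we can obtain the macrodynamical equations for random pruning as follows.

$$\sigma_t^2 = \frac{\alpha(1-c)}{c}\sum_{l=0}^{L-1} c_l^2 + \bar\sigma_t^2, \tag{38}$$
$$\bar\sigma_t^2 = \sum_{l=0}^{L-1}\sum_{l'=0}^{L-1} c_l c_{l'}\, v_{t-l,t-l'}, \tag{39}$$
$$v_{t-l,t-l'} = \alpha\delta_{l,l'} + U_{t-l}U_{t-l'} \sum_{k=0}^{L-1}\sum_{k'=0}^{L-1} c_k c_{k'}\, v_{t-l-k-1,t-l'-k'-1} + \alpha\left(c_{l-l'-1}U_{t-l'} + c_{l'-l-1}U_{t-l}\right), \tag{40}$$
$$U_t = \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma_{t-1}} \exp\left(-\frac{(s^{t-1})^2}{2\sigma_{t-1}^2}\right), \tag{41}$$
$$s^t = \sum_{l=0}^{L-1} c_l\, m_{t-l}, \tag{42}$$
$$m_{t+1} = \mathrm{erf}\left(\frac{s^t}{\sqrt{2}\,\sigma_t}\right), \tag{43}$$

where the initial conditions are the same as in the case of a fully connected network, and $\delta$ is
Kronecker's delta. Equation (38) means that the variance $\sigma_t^2$ after pruning is the sum of the
variance $\bar\sigma_t^2$ of cross-talk noise among patterns and the variance of new noise generated by pruning.
Using the discrete Fourier transformation, as for a fully connected network, the macroscopic
steady state equations in the case of random pruning become

$$\sigma^2 = \bar\sigma^2 + \frac{\alpha(1-c)}{c}\sum_{l=0}^{L-1} c_l^2, \tag{44}$$
$$U = \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma}\exp\left(-\frac{s^2}{2\sigma^2}\right), \tag{45}$$
$$s = mL, \tag{46}$$
$$m = \mathrm{erf}\left(\frac{s}{\sqrt{2}\,\sigma}\right), \tag{47}$$

where $\bar\sigma^2$ is given by the right-hand side of (22).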

It is obvious that storage capacity increases with the length of delay $L$ if the connecting
rate $c$ is constant. Therefore, the storage capacity $\alpha_c$ is investigated under the condition that
$c \times L$ is constant. That is, (44)-(47) are solved numerically and the steady state
overlaps $m_\infty$ are investigated by using $c = 1/L$, where $c_l = 1$. Figure 7 shows the results of
theoretical calculations and computer simulations when $L = 1, 2, 3, 5$ and $10$. In this figure, the
abscissa is the loading rate $\alpha$. In the computer simulations, the number of neurons is $N = 500$.
The initial condition is that the states of all neurons and all delay elements are set to be the
stored pattern sequences. The steady state overlaps $m_\infty$ are obtained by calculations with a
sufficient number of steps. Eleven simulations were carried out for each combination of loading
rate $\alpha$ and length of delay $L$. Data points • , o , ■ , □ , * indicate the medians, that is, the sixth
largest values, for $L = 1, 2, 3, 5$ and $10$, respectively, in the eleven trials. Error bars indicate the
third and the ninth largest values in the eleven trials. In each trial, the loading rate
is increased by adding new patterns.

Figure 7 displays the following results. In the case of $L = 1$ ($c = 1/L = 1.0$), which is fully
connected with no delay elements, the recurrent neural network's storage capacity $\alpha_c$ for sequential
association is 0.269. This agrees with the results of previous works [29, 30, 31]. As the length
of delay $L$ increases, storage capacity $\alpha_c$ increases even though the total number of synapses is
constant. This phenomenon is due to the time lag of synaptic inputs introduced by the delays, which reduces
the statistical correlation among synaptic inputs. As a result, the variance of the noise component
decreases. This figure shows that the theoretical results closely agree with the simulation results.
Therefore, only a theoretical calculation is executed when the length of delay $L$ is large, and the
results of this calculation are shown in Figure 8.
The properties of $L = \infty$ in Figure 8 are obtained as follows. The first term of the r.h.s.
of (44) can be written in the form $\alpha r$ from (22). Therefore, we first numerically investigated the
dependence of r on L. Figure IHl shows the results. The straight line in this figure shows a 
first-order approximation, which is obtained by using a least squares method, of the relation 
between logL and logr at the phase transition point. We can see that r is 0(L^'^^) by reading 
the slope of the line. On the other hand, considering $c = 1/L$ and $c_l = 1$, when $L$ is extremely
large, the second term of the r.h.s. of (44) becomes $\alpha L^2$, that is, $O(L^2)$. Therefore, only the
second term is effective in the r.h.s. of (44) when $L$ is extremely large, and (44) becomes

$$\sigma^2 \simeq \alpha L^2. \tag{48}$$

Based on these considerations, the properties of $L = \infty$ in Figure 8 are obtained by ignoring
the first term in the r.h.s. of (44). Figure 8 shows that the steady state overlap asymptotically
approaches that of $L = \infty$ obtained above as $L$ becomes large.

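Now, in the case of $L = \infty$, the storage capacity can be obtained analytically as follows.
Substituting (46) into (47), we obtain

$$m = \mathrm{erf}\left(\frac{mL}{\sqrt{2}\,\sigma}\right). \tag{49}$$

Equation (49) has nontrivial solutions $m \neq 0$ within the range where the slope of the r.h.s.
at $m = 0$ is greater than 1. Here, the slope of the r.h.s. of (49) regarding $m$ can be written as

$$\frac{d}{dm}\,\mathrm{erf}\left(\frac{mL}{\sqrt{2}\,\sigma}\right) = \frac{L}{\sigma}\sqrt{\frac{2}{\pi}}\exp\left(-\frac{m^2L^2}{2\sigma^2}\right). \tag{50}$$

Therefore, we can obtain the critical value of the noise $\sigma_c^2$ as

$$\sigma_c^2 = \frac{2}{\pi}L^2. \tag{51}$$

From (48) and (51), the storage capacity $\alpha_c$ of random pruning in the limit where $L$
approaches $\infty$ is obtained as

$$\alpha_c = \frac{2}{\pi} \simeq 0.637. \tag{52}$$
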
Figure IHl shows that the storage capacity approaches this value asymptotically as L increases. 
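
The large $L$ argument for random pruning can be checked numerically with a short sketch. With $\sigma^2 \simeq \alpha L^2$, the fixed-point equation for the overlap becomes $m = \mathrm{erf}(m/\sqrt{2\alpha})$, which no longer contains $L$. The iteration count and the two test loading rates below are arbitrary choices of this sketch.

```python
import math

def overlap_large_L(alpha, iters=5000):
    """Fixed point of m = erf(m / sqrt(2*alpha)), reached by iteration from m = 1."""
    m = 1.0
    for _ in range(iters):
        m = math.erf(m / math.sqrt(2.0 * alpha))
    return m

def slope_at_zero(alpha):
    """Slope of erf(m / sqrt(2*alpha)) at m = 0; retrieval requires a slope above 1."""
    return math.sqrt(2.0 / (math.pi * alpha))

alpha_c = 2.0 / math.pi
below = overlap_large_L(0.5)   # loading rate below 2/pi: a nonzero overlap survives
above = overlap_large_L(0.7)   # loading rate above 2/pi: the overlap decays to zero
```

The slope at the origin equals 1 exactly at $\alpha = 2/\pi$, so the nonzero solution disappears continuously there.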
3.3 Systematic Pruning 

Chechik et al. discussed the functional significance of synaptic pruning following overgrowth
on the basis of a correlation-type associative memory model. They pointed out that synapse
efficiency, which is storage capacity per synapse, increases by cutting synapses that are lightly
weighted after correlation learning.

This type of systematic pruning can be expressed by the nonlinear function $f(\cdot)$ shown in Figure
10. Synapses in the range of $-z_{th} < z < +z_{th}$ are pruned by $f(\cdot)$. In this case, synaptic
connections are constituted by

$$J_{ij}^l = \frac{c_l\sqrt{\alpha N}}{N}\, f\left(T_{ij}^l\right), \tag{53}$$
$$T_{ij}^l = \frac{1}{\sqrt{\alpha N}}\sum_\mu \xi_i^{\mu+1+l}\xi_j^\mu. \tag{54}$$

The variable $T_{ij}^l$ of (54) is a stochastic variable that obeys the normal distribution $N(0,1)$. Therefore, the
relationship between the connection rate $c$ and $z_{th}$ is given by

$$c = 2\int_{z_{th}}^{\infty} Dz = 1 - \mathrm{erf}\left(\frac{z_{th}}{\sqrt{2}}\right), \tag{55}$$

where $Dz$ stands for $\frac{dz}{\sqrt{2\pi}}\exp\left(-\frac{z^2}{2}\right)$.
Modifying the connection weight $J_{ij}^l$ of (53), we obtain

$$J_{ij}^l = \frac{c_l\sqrt{\alpha N}}{N}\left[J\,T_{ij}^l + \left(f\left(T_{ij}^l\right) - J\,T_{ij}^l\right)\right], \tag{56}$$

where

$$J = \int Dx\, f'(x) = \int Dx\, x f(x), \tag{57}$$

and integrals without explicit limits are taken from $-\infty$ to $+\infty$.


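Using this modification, we obtain the total input of the $i$th neuron at time $t$ as

$$u_i^t = \sum_{l=0}^{L-1}\sum_{j=1}^{N} J_{ij}^l\, x_j^{t-l} \tag{58}$$
$$= \sum_{l=0}^{L-1}\sum_{j=1}^{N} \frac{c_l J}{N}\sum_\mu \xi_i^{\mu+1+l}\xi_j^\mu\, x_j^{t-l} + \sum_{l=0}^{L-1}\sum_{j=1}^{N} \frac{c_l\sqrt{\alpha N}}{N}\left(f\left(T_{ij}^l\right) - J\,T_{ij}^l\right) x_j^{t-l} \tag{59}$$
$$= J\left(\sum_{l=0}^{L-1} c_l\, m_{t-l}\right)\xi_i^{t+1} + J\sum_{l=0}^{L-1} c_l\sum_{\nu \neq t}\xi_i^{\nu+1} m_{\nu-l}^{t-l} + \sum_{l=0}^{L-1}\sum_{j=1}^{N} \frac{c_l\sqrt{\alpha N}}{N}\left(f\left(T_{ij}^l\right) - J\,T_{ij}^l\right) x_j^{t-l}. \tag{60}$$

As in the case of a fully connected network, the first and second terms of (60) are the signal useful
for recall and the cross-talk noise, respectively. The third term is new noise generated by the nonlinear
transformation. Here, the average of the third term equals 0, and the variance equals

$$E\left[\left(\sum_{l=0}^{L-1}\sum_{j=1}^{N} \frac{c_l\sqrt{\alpha N}}{N}\left(f\left(T_{ij}^l\right) - J\,T_{ij}^l\right) x_j^{t-l}\right)^2\right] = \alpha\sum_{l=0}^{L-1} c_l^2 \int Dx\,\left(f(x) - Jx\right)^2$$
$$= \alpha\sum_{l=0}^{L-1} c_l^2\left(\int Dx\, f(x)^2 - 2J\int Dx\, x f(x) + J^2\int Dx\, x^2\right)$$
$$= \alpha\sum_{l=0}^{L-1} c_l^2\left(\tilde J^2 - 2J^2 + J^2\right) = \alpha\left(\tilde J^2 - J^2\right)\sum_{l=0}^{L-1} c_l^2, \tag{61}$$

where $\tilde J^2 = \int Dx\, (f(x))^2$. As a result, we can obtain the macrodynamical equations for
systematic pruning as follows.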

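$$\sigma_t^2 = \bar\sigma_t^2 + \alpha\left(\tilde J^2 - J^2\right)\sum_{l=0}^{L-1} c_l^2, \tag{62}$$
$$\bar\sigma_t^2 = J^2\sum_{l=0}^{L-1}\sum_{l'=0}^{L-1} c_l c_{l'}\, v_{t-l,t-l'}, \tag{63}$$
$$\tilde J^2 = \int Dx\, (f(x))^2, \tag{64}$$
$$J = \int Dx\, x f(x), \tag{65}$$
$$v_{t-l,t-l'} = \alpha\delta_{l,l'} + U_{t-l}U_{t-l'} \sum_{k=0}^{L-1}\sum_{k'=0}^{L-1} c_k c_{k'}\, v_{t-l-k-1,t-l'-k'-1} + \alpha\left(c_{l-l'-1}U_{t-l'} + c_{l'-l-1}U_{t-l}\right), \tag{66}$$
$$U_t = J\sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma_{t-1}} \exp\left(-\frac{(s^{t-1})^2}{2\sigma_{t-1}^2}\right), \tag{67}$$
$$s^t = J\sum_{l=0}^{L-1} c_l\, m_{t-l}, \tag{68}$$
$$m_{t+1} = \mathrm{erf}\left(\frac{s^t}{\sqrt{2}\,\sigma_t}\right), \tag{69}$$

where the initial conditions are the same as in the case of a fully connected network, and $\delta$ is
Kronecker's delta. Equation (62) means that the variance $\sigma_t^2$ after pruning is the sum of the
variance $\bar\sigma_t^2$ of cross-talk noise among patterns and the variance $\alpha(\tilde J^2 - J^2)\sum_{l=0}^{L-1} c_l^2$ of the
noise generated by pruning. Using the discrete Fourier transformation, as in the case of full
connections, and measuring $\sigma$ and $s$ in units of $J$, the macroscopic steady state equations in the
case of systematic pruning become

$$\sigma^2 = \bar\sigma^2 + \alpha\left(\frac{\tilde J^2}{J^2} - 1\right)\sum_{l=0}^{L-1} c_l^2, \tag{70}$$
$$U = \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma}\exp\left(-\frac{s^2}{2\sigma^2}\right), \tag{71}$$
$$s = mL, \tag{72}$$
$$m = \mathrm{erf}\left(\frac{s}{\sqrt{2}\,\sigma}\right), \tag{73}$$

where $\bar\sigma^2$ is given by the right-hand side of (22).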

As for random pruning, the storage capacity $\alpha_c$ is investigated under the condition that $c \times L$
is constant. That is, (70)-(73) are solved numerically, and the steady state overlaps $m_\infty$
are investigated by using $c = 1/L$, where $c_l = 1$. Figure 11 shows the results of theoretical
calculations and computer simulations when $L = 1, 2, 3, 5$ and $10$. In this figure, the abscissa is
the loading rate $\alpha$. In the computer simulations, the number of neurons is $N = 500$, and the
steady state overlaps $m_\infty$ are obtained by calculations with a sufficient number of steps. Eleven
simulations were carried out for each combination of loading rate $\alpha$ and length of delay $L$. Data
points • , o , ■ , □ , * indicate the medians, that is, the sixth largest values, for $L = 1, 2, 3, 5$ and $10$,
respectively, in the eleven trials. Error bars indicate the third and the ninth largest values in
the eleven trials. In each trial, the loading rate is increased by adding new patterns.

Figure 11 shows that as the length of delay $L$ increases, storage capacity $\alpha_c$ increases, though
the total number of synapses is constant. This figure also shows that the theoretical results closely
agree with the simulation results. Therefore, only a theoretical calculation is executed when the
length of delay $L$ is large. Figure 12 shows the results. Figure 13 shows the relationship between
the length of delay $L$ and the storage capacities.

Here, we investigate the dependence of the first and second terms of the r.h.s. of (70) on $L$
in the same manner as for random pruning to obtain the asymptotic storage capacity analytically
when $L$ is extremely large.

The first term of the r.h.s. of (70) can be written in the form $\alpha r$ from (22). Therefore, we first
numerically investigate the dependence of r on L. Figure El shows the results. The straight 
line in this figure shows a first-order approximation, which is obtained by using a least squares 
method, of the relation between logL and logr at the phase transition point. We can see that 
r is 0{L^'^^) by reading the slope of the line. 

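On the other hand, the dependence of the second term on $L$ can be obtained as follows.
When $L$ is extremely large, that is, when $c = 1/L$ is extremely small and $z_{th}$ is extremely large,
the connection rate $c$ of (55) is as follows.

$$c = 2\int_{z_{th}}^{\infty} Dz = 1 - \mathrm{erf}\left(\frac{z_{th}}{\sqrt{2}}\right) \simeq \sqrt{\frac{2}{\pi}}\, z_{th}^{-1}\exp\left(-\frac{z_{th}^2}{2}\right), \quad z_{th} \to \infty. \tag{74}$$

$J$ and $\tilde J^2$ of (65) and (64) become

$$J = 2\int_{z_{th}}^{\infty} Dz\, z^2 = \sqrt{\frac{2}{\pi}}\, z_{th}\exp\left(-\frac{z_{th}^2}{2}\right) + 1 - \mathrm{erf}\left(\frac{z_{th}}{\sqrt{2}}\right) \simeq \sqrt{\frac{2}{\pi}}\, z_{th}\exp\left(-\frac{z_{th}^2}{2}\right), \quad z_{th} \to \infty, \tag{75}$$
$$\tilde J^2 = 2\int_{z_{th}}^{\infty} Dz\, z^2 = J. \tag{76}$$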

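Considering $c = 1/L$ and $c_l = 1$, from (74), (75) and (76), the second term of (70) can be
transformed as

$$\alpha\left(\frac{\tilde J^2}{J^2} - 1\right)L = \alpha\left(\frac{1}{J} - 1\right)L \simeq \alpha\left(\frac{1}{z_{th}^2\, c} - 1\right)L \simeq \alpha\left(\frac{1}{-2c\ln c} - 1\right)L = \alpha\left(\frac{L^2}{2\ln L} - L\right). \tag{77}$$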



Since (77) is $O(L^2/\ln L)$, only the second term is effective in the r.h.s. of (70) when $L$ is extremely
large. Based on these considerations, the storage capacity of systematic pruning can be obtained
as follows. Substituting (72) into (73), we obtain

$$m = \mathrm{erf}\left(\frac{mL}{\sqrt{2}\,\sigma}\right). \tag{78}$$

Equation (78) has nontrivial solutions $m \neq 0$ within the range where the slope of the r.h.s.
at $m = 0$ is greater than 1. Here, the slope of the r.h.s. of (78) regarding $m$ can be written as

$$\frac{d}{dm}\,\mathrm{erf}\left(\frac{mL}{\sqrt{2}\,\sigma}\right) = \frac{L}{\sigma}\sqrt{\frac{2}{\pi}}\exp\left(-\frac{m^2L^2}{2\sigma^2}\right). \tag{79}$$

Therefore, we can obtain the critical value of the noise $\sigma_c^2$ as

$$\sigma_c^2 = \frac{2}{\pi}L^2. \tag{80}$$

From (77) and (80), the storage capacity $\alpha_c$ of systematic pruning in the limit where $L$
approaches $\infty$ is obtained as

$$\alpha_c = \frac{4}{\pi}\ln L. \tag{81}$$


Figure 13 shows that the storage capacity curve becomes parallel to the line $\frac{4}{\pi}\ln L$ when
$L$ is large. This means that the storage capacity approaches $\frac{4}{\pi}\ln L$ in relative terms, and this result
supports the derived theory.
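The slow approach to (81) can also be explored with a rough numerical estimate. The sketch below is only an illustration under a stated assumption: it keeps just the second term $\alpha(1/J - 1)L$ of the noise variance, using the exact $J$ of (75) with $z_{th}$ fixed by $c = 1/L$, and equates it to $\sigma_c^{2}$ from (80). The first term of (70) is neglected, so the numbers are not the paper's numerical solution, and all function names are hypothetical.

```python
import math

def z_threshold(c, lo=0.0, hi=40.0, n_iter=200):
    """Bisection for z_th with erfc(z_th / sqrt(2)) = c (erfc is decreasing)."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if math.erfc(mid / math.sqrt(2.0)) > c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def j_exact(z_th):
    """Exact J of (75)."""
    boundary = math.sqrt(2.0 / math.pi) * z_th * math.exp(-z_th ** 2 / 2.0)
    return boundary + math.erfc(z_th / math.sqrt(2.0))

def capacity_second_term_only(L):
    """alpha from sigma_c^2 = alpha (1/J - 1) L; assumes L >= 2 and no first term."""
    J = j_exact(z_threshold(1.0 / L))
    sigma_c_sq = (2.0 / math.pi) * L ** 2
    return sigma_c_sq / ((1.0 / J - 1.0) * L)

def capacity_asymptotic(L):
    """Asymptotic law (81): (4/pi) ln L."""
    return 4.0 / math.pi * math.log(L)

if __name__ == "__main__":
    for L in (10, 100, 1000, 10 ** 6):
        est = capacity_second_term_only(L)
        print(f"L={L}: estimate={est:.3f}, (4/pi) ln L={capacity_asymptotic(L):.3f}")
```

Under this assumption the estimate and $\frac{4}{\pi}\ln L$ stay within a modest factor of each other for large $L$, with corrections that decay only logarithmically.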

As the length of delay $L$ increases, the storage capacity $\alpha_c$ increases even though the total
number of synapses remains constant; however, the manner of increase differs from that of random
pruning. The storage capacity is proportional to the logarithm of the length of delay $L$, and the
proportionality constant is $4/\pi$. In other words, for systematic pruning, the storage capacity diverges as
the length of delay $L$ increases. It is remarkable that the storage capacity diverges even
though the total number of synapses is constant.



4 Conclusions 

We analyzed a discrete synchronous-type model that adopts correlation learning by using
statistical neurodynamics and discussed sequential associative memory by recurrent neural networks
with synaptic delay and pruning. First, we explained the Yanai-Kim theory [21], which involves
macrodynamical equations for the dynamics of a network with serial delay elements. Next,
considering the translational symmetry of the explained equations, we re-derived the macroscopic
steady state equations of the model by using the discrete Fourier transformation [23], [20]. The
storage capacity was analyzed quantitatively. As a result, we showed that the storage capacity
is proportional to the length of delay $L$ in the large-$L$ limit, with a proportionality constant
of 0.195. Furthermore, two types of synaptic pruning were analyzed: random pruning and
systematic pruning. As a result, it became clear that under both pruning conditions, the storage
capacity grows with an increase in delay and a decrease in the connecting rate when the total
number of synapses is constant. Moreover, an interesting fact became clear: the storage capacity
approaches $2/\pi$ asymptotically under random pruning. In contrast, under systematic pruning the
storage capacity diverges in proportion to the logarithm of the length of delay, and the
proportionality constant is $4/\pi$. These results theoretically support the significance of pruning
following an overgrowth of synapses in the brain and strongly suggest
that the brain prefers to store dynamic attractors such as sequences or limit cycles rather than
equilibrium states.



Appendix A: Derivations of the Macrodynamical Equations of 
the Delayed Network 



Using (|T?1|) and (|TT|l . we obtain 

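$$
z_i^{t} = z_A + z_B, \tag{82}
$$

$$
z_A = \frac{1}{N}\sum_{l=0}^{L-1}\sum_{\mu \neq t}\sum_{j} c_l\, \xi_i^{\mu+1}\xi_j^{\mu-l}\, x_j^{t-l,(\mu-l)}, \tag{83}
$$

$$
z_B = \sum_{l=0}^{L-1}\sum_{\mu \neq t} c_l\, \xi_i^{\mu+1}\, U_{t-l} \sum_{l'=0}^{L-1} c_{l'}\, m_{\mu-l-l'-1}^{t-l-l'-1}, \tag{84}
$$

where $x_j^{t,(\mu)}$ is the variable obtained by removing the influence of $\xi^{\mu}$ from $x_j^{t}$. Using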
(IHl,® and (|H1, we obtain 

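$$
E\left[z_i^{t}\right] = 0, \tag{85}
$$

$$
\sigma_t^{2} = E\left[\left(z_i^{t}\right)^{2}\right] \tag{86}
$$

$$
\phantom{\sigma_t^{2}} = E\left[z_A^{2} + z_B^{2} + 2 z_A z_B\right]. \tag{87}
$$

Transforming $z_A^{2}$, $z_B^{2}$ and $z_A z_B$ with consideration given to their correlation, we obtain

$$
E\left[z_A^{2}\right] = \alpha\sum_{l=0}^{L-1} c_l^{2}, \tag{88}
$$

$$
E\left[z_B^{2}\right] = \sum_{\mu \neq t}\sum_{l=0}^{L-1}\sum_{l'=0}^{L-1}\sum_{k=0}^{L-1}\sum_{k'=0}^{L-1}
c_l c_{l'} c_k c_{k'}\, U_{t-l} U_{t-l'}\, m_{\mu-l-k-1}^{t-l-k-1}\, m_{\mu-l'-k'-1}^{t-l'-k'-1}, \tag{89}
$$

$$
E\left[2 z_A z_B\right] = \alpha\sum_{l=0}^{L-1}\sum_{l'=0}^{L-1}
c_l c_{l'}\left(c_{l-l'-1}U_{t-l'} + c_{l'-l-1}U_{t-l}\right), \tag{90}
$$

where

$$
v_{t-l,\,t-l'} = \sum_{\mu \neq t} m_{\mu-l}^{t-l}\, m_{\mu-l'}^{t-l'}. \tag{91}
$$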

Using (jHZll-dnni), we obtain 

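$$
\sigma_t^{2} = \alpha\sum_{l=0}^{L-1} c_l^{2}
+ \sum_{l=0}^{L-1}\sum_{l'=0}^{L-1}\sum_{k=0}^{L-1}\sum_{k'=0}^{L-1}
c_l c_{l'} c_k c_{k'}\, U_{t-l} U_{t-l'}\, v_{t-l-k-1,\,t-l'-k'-1}
+ \alpha\sum_{l=0}^{L-1}\sum_{l'=0}^{L-1} c_l c_{l'}\left(c_{l-l'-1}U_{t-l'} + c_{l'-l-1}U_{t-l}\right). \tag{92}
$$

Using the definition of the noise term together with (91), we also obtain

$$
\sigma_t^{2} = \sum_{l=0}^{L-1}\sum_{l'=0}^{L-1} c_l c_{l'}\, v_{t-l,\,t-l'}. \tag{93}
$$

Comparing (92) and (93) as identities in $c_l c_{l'}$, we obtain

$$
v_{t-l,\,t-l'} = \alpha\,\delta_{l,l'}
+ U_{t-l}U_{t-l'}\sum_{k=0}^{L-1}\sum_{k'=0}^{L-1} c_k c_{k'}\, v_{t-l-k-1,\,t-l'-k'-1}
+ \alpha\left(c_{l-l'-1}U_{t-l'} + c_{l'-l-1}U_{t-l}\right), \tag{94}
$$

where $\delta$ is Kronecker's delta. Using (13), we obtain

$$
U_t = \frac{1}{N}\sum_{i=1}^{N} F'\left(u_i^{t-1,(\mu)}\right)
= E\left[\left\langle\!\left\langle F'\left(u^{t-1}\right)\right\rangle\!\right\rangle\right]
= \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma_{t-1}}\exp\left(-\frac{s_{t-1}^{2}}{2\sigma_{t-1}^{2}}\right), \tag{95}
$$

where $u_i^{t,(\mu)}$ is the variable obtained by removing the influence of $\xi^{\mu}$ from $u_i^{t}$, and $s_t$ is the signal part of $u_i^{t}$. Here, $\langle\langle \cdot \rangle\rangle$
stands for the average over the pattern $\xi$.

As a result, we can obtain the macrodynamical equations for overlap $m$, that is, (14)-(18).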



Appendix B: Derivations of the Macroscopic Steady State Equa- 
tions by Discrete Fourier Transformation 

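Using the discrete Fourier transformation, we re-derive the general term of $v(n)$, which is
expressed by the recurrence formula in (20) [25, 26]. Applying the discrete Fourier transformation
to (20) and (21), we obtain

$$
V(r) = \alpha + U^{2}\sum_{i=1-L}^{L-1}\left(L - |i|\right)V(r)\,e^{-j2\pi\frac{ri}{2T+1}} + \alpha U D(r), \tag{96}
$$

$$
D(r) = \sum_{n=-T}^{T} d(n)\,e^{-j2\pi\frac{rn}{2T+1}}
= \sum_{n=1}^{L}\left(e^{-j2\pi\frac{rn}{2T+1}} + e^{j2\pi\frac{rn}{2T+1}}\right), \tag{97}
$$

where $V(r)$ and $D(r)$ are the discrete Fourier transformations of $v(n)$ and $d(n)$, respectively.
Solving (96) and (97) in terms of $V(r)$, we obtain

$$
V(r) = \frac{\alpha\left(1 + U\sum_{n=1}^{L}\left(e^{-j2\pi\frac{rn}{2T+1}} + e^{j2\pi\frac{rn}{2T+1}}\right)\right)}
{1 - U^{2}\sum_{i=1-L}^{L-1}\left(L - |i|\right)e^{-j2\pi\frac{ri}{2T+1}}}. \tag{98}
$$

Summations in (98) are calculated as follows.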

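$$
\sum_{n=1}^{L}\left(e^{-j2\pi\frac{rn}{2T+1}} + e^{j2\pi\frac{rn}{2T+1}}\right)
= \begin{cases}
\dfrac{\sin\left((2L+1)\frac{\pi r}{2T+1}\right)}{\sin\left(\frac{\pi r}{2T+1}\right)} - 1, & r \neq 0, \\[2ex]
2L, & r = 0,
\end{cases} \tag{99}
$$

$$
\sum_{i=1-L}^{L-1} e^{-j2\pi\frac{ri}{2T+1}}
= \begin{cases}
\dfrac{\sin\left((2L-1)\frac{\pi r}{2T+1}\right)}{\sin\left(\frac{\pi r}{2T+1}\right)}, & r \neq 0, \\[2ex]
2L - 1, & r = 0.
\end{cases} \tag{100}
$$

When $r \neq 0$,

$$
\sum_{i=1-L}^{L-1} |i|\, e^{-j2\pi\frac{ri}{2T+1}}
= \frac{2T+1}{2\pi}\frac{\partial}{\partial r}\left(2\sum_{i=1}^{L-1}\sin\left(\frac{2\pi r i}{2T+1}\right)\right)
= \frac{L\cos\left((L-1)\frac{2\pi r}{2T+1}\right) - (L-1)\cos\left(\frac{2L\pi r}{2T+1}\right) - 1}
{2\sin^{2}\left(\frac{\pi r}{2T+1}\right)}. \tag{101}
$$

When $r = 0$,

$$
\sum_{i=1-L}^{L-1} |i|\, e^{-j2\pi\frac{ri}{2T+1}} = L(L-1). \tag{102}
$$

Substituting (99)-(102) into (98), we obtain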



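$$
V(r) = \begin{cases}
\dfrac{2\alpha\sin\left(\frac{\pi r}{2T+1}\right)\left((1-U)\sin\left(\frac{\pi r}{2T+1}\right) + U\sin\left((2L+1)\frac{\pi r}{2T+1}\right)\right)}
{2\sin^{2}\left(\frac{\pi r}{2T+1}\right) - U^{2}\left(1 - \cos\left(\frac{2L\pi r}{2T+1}\right)\right)}, & r \neq 0, \\[3ex]
\dfrac{\alpha(1 + 2UL)}{1 - U^{2}L^{2}}, & r = 0.
\end{cases} \tag{103}
$$

Since the inverse discrete Fourier transformation of (103) equals $v(n)$, we obtain

$$
v(n) = \lim_{T \to \infty}\frac{1}{2T+1}\sum_{r=-T}^{T} V(r)\, e^{j2\pi\frac{rn}{2T+1}}. \tag{104}
$$

Substituting (104) into the expression for $\sigma^{2}$, we obtain

$$
\sigma^{2} = \lim_{T \to \infty}\frac{1}{2T+1}\sum_{r=-T}^{T} V(r)\sum_{n=1-L}^{L-1}\left(L - |n|\right) e^{j2\pi\frac{rn}{2T+1}}. \tag{105}
$$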
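The closed forms in (99)-(102) and the resulting expression (103) can be checked against direct summation. The sketch below is a minimal numerical check. The function names and the parameter values ($\alpha = 0.3$, $U = 0.1$, $L = 4$, $T = 50$) are arbitrary illustrative assumptions, chosen so that no denominator vanishes.

```python
import cmath
import math

def direct_sums(L, T, r):
    """Brute-force evaluation of the three sums in (99), (100) and (101)/(102)."""
    w = 2.0 * math.pi * r / (2 * T + 1)
    s99 = sum(cmath.exp(-1j * w * n) + cmath.exp(1j * w * n) for n in range(1, L + 1))
    s100 = sum(cmath.exp(-1j * w * i) for i in range(1 - L, L))
    s101 = sum(abs(i) * cmath.exp(-1j * w * i) for i in range(1 - L, L))
    return s99, s100, s101

def closed_forms(L, T, r):
    """Closed forms (99), (100) and (101)/(102)."""
    if r == 0:
        return 2.0 * L, 2.0 * L - 1.0, float(L * (L - 1))
    x = math.pi * r / (2 * T + 1)
    c99 = math.sin((2 * L + 1) * x) / math.sin(x) - 1.0
    c100 = math.sin((2 * L - 1) * x) / math.sin(x)
    c101 = (L * math.cos(2 * (L - 1) * x) - (L - 1) * math.cos(2 * L * x) - 1.0) / (
        2.0 * math.sin(x) ** 2
    )
    return c99, c100, c101

def v_direct(alpha, U, L, T, r):
    """V(r) from (98), using brute-force sums."""
    s99, s100, s101 = direct_sums(L, T, r)
    return alpha * (1.0 + U * s99) / (1.0 - U ** 2 * (L * s100 - s101))

def v_closed(alpha, U, L, T, r):
    """V(r) from the closed form (103)."""
    if r == 0:
        return alpha * (1.0 + 2.0 * U * L) / (1.0 - U ** 2 * L ** 2)
    x = math.pi * r / (2 * T + 1)
    num = 2.0 * alpha * math.sin(x) * ((1.0 - U) * math.sin(x) + U * math.sin((2 * L + 1) * x))
    den = 2.0 * math.sin(x) ** 2 - U ** 2 * (1.0 - math.cos(2 * L * x))
    return num / den

def max_errors(alpha=0.3, U=0.1, L=4, T=50):
    """Largest deviation over r = -T..T for the sums and for V(r)."""
    err_sums, err_v = 0.0, 0.0
    for r in range(-T, T + 1):
        for d, c in zip(direct_sums(L, T, r), closed_forms(L, T, r)):
            err_sums = max(err_sums, abs(d - c))
        err_v = max(err_v, abs(v_direct(alpha, U, L, T, r) - v_closed(alpha, U, L, T, r)))
    return err_sums, err_v

if __name__ == "__main__":
    print(max_errors())
```

Both reported deviations should be at the level of floating-point rounding if the closed forms are transcribed correctly.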



Using (100)-(105) and rewriting $\frac{r}{2T+1} \to x$ and $\frac{1}{2T+1} \to dx$, we can express $\sigma^{2}$ in a form using
a simple integral, as in (22). As a result, we can obtain the steady state equations in terms of the
macroscopic variables of the network given in the main text.

Figure Captions

1. Structure of delayed network. 

2. Relationship between loading rate $\alpha$ and overlap $m$ (theory).

3. Relationship between loading rate $\alpha$ and overlap $m$ (computer simulation).

4. Relationship between loading rate $\alpha$ and overlap $m$. These lines are obtained by solving
steady state equations numerically.

5. Relationship between length of delay $L$ and storage capacity $\alpha_c$. This line is obtained by
solving the steady state equations numerically. In the large-$L$ limit, the storage capacity is
$0.195L$.

6. Representation of delayed synapses by pruning: (a) length of delay is three, and (b) length
of delay is five.

7. Relationship between loading rate $\alpha$ and overlap $m$ when synapses are randomly pruned
(theory (t) and computer simulation (s)).

8. Relationship between loading rate $\alpha$ and overlap $m$ when synapses are randomly pruned
(theory).

9. Relationship between $\log L$ and $\log r$ when synapses are randomly pruned.

10. Nonlinear function for systematic pruning.

11. Relationship between loading rate $\alpha$ and overlap $m$ when synapses are systematically
pruned (theory (t) and computer simulation (s)).

12. Relationship between loading rate $\alpha$ and overlap $m$ when synapses are systematically
pruned (theory).

13. Relationship between length of delay $L$ and storage capacity $\alpha_c$ when synapses are
systematically pruned.

14. Relationship between log L and log cr^ when synapses are systematically pruned. 






Figures 



[Figure 1: diagram not recoverable from the extraction; labels: Neuron, Delay Element, L-1]

Figure 1: Structure of delayed network.

[Figure 2: plot not recoverable from the extraction; horizontal axis: Loading Rate; curves labeled L=1, 3, 10]

Figure 2: Relationship between loading rate $\alpha$ and overlap $m$ (theory).

Figure 3: Relationship between loading rate $\alpha$ and overlap $m$ (computer simulation).

Figure 4: Relationship between loading rate $\alpha$ and overlap $m$. These lines are obtained by
solving steady state equations numerically.

Figure 5: Relationship between length of delay $L$ and storage capacity $\alpha_c$. This line is obtained
by solving the steady state equations numerically. In the large-$L$ limit, the storage capacity is
$0.195L$.

Figure 6: Representation of delayed synapses by pruning: (a) length of delay is three, and (b)
length of delay is five.

Figure 7: Relationship between loading rate $\alpha$ and overlap $m$ when synapses are randomly
pruned (theory (t) and computer simulation (s)).

Figure 8: Relationship between loading rate $\alpha$ and overlap $m$ when synapses are randomly
pruned (theory).

Figure 9: Relationship between $\log L$ and $\log r$ when synapses are randomly pruned.

Figure 10: Nonlinear function for systematic pruning.

Figure 11: Relationship between loading rate $\alpha$ and overlap $m$ when synapses are systematically
pruned (theory (t) and computer simulation (s)).

Figure 12: Relationship between loading rate $\alpha$ and overlap $m$ when synapses are systematically
pruned (theory).

Figure 13: Relationship between length of delay $L$ and storage capacity $\alpha_c$ when synapses are
systematically pruned.

Figure 14: Relationship between log L and log when synapses are systematically pruned.